Abstract. This paper is part of a larger project that examines some of the best-selling
iPhone apps designed for learning English pronunciation. Informed by the literature on
pronunciation teaching/acquisition, Computer Assisted Pronunciation Teaching (CAPT),
Computer Assisted Language Learning (CALL) and Mobile-learning (M-learning), it
provides a critical evaluation of the strengths and limitations of iPhone apps designed to 
improve the user’s English pronunciation autonomously. The language learning potential 
of the apps is weighed up, appraising the aspects of pronunciation addressed by each app 
(individual phonemes, stress, intonation, etc.). The paper concludes that iPhone apps
have great potential for practising and improving certain aspects of English pronunciation,
such as sound discrimination, the learning of English phonemes, or the pronunciation
of individual words, and it explores how existing apps might be improved in the
future. The paper identifies feedback as one of the main limitations of current apps,
while acknowledging that this limitation could be overcome relatively easily with
existing technology. It also points to directions so far neglected in the development of
iPhone apps for pronunciation teaching, such as the teaching of suprasegmental features
or communicative practice.
1. Introduction

Pronunciation is one of the most challenging aspects of language to master for language 
learners, given that it entails not only mental capacities but also psycho-motor and 
perceptual abilities (MacCarthy, 1978, p. 2; Witt & Young, 1997, p. 1). 

Because pronunciation is such a demanding competence, and since it is often
compromised in the classroom due to time constraints, technology seems to be an
ideal support for pronunciation teaching. CAPT enhances presentation styles and
makes materials more ‘psychologically accessible’ (Pennington, 1996, p. 1). It provides
private, stress-free environments which allow unlimited tries and different types of
output with different voices and models (Godwin-Jones, 2009, p. 5), access to
virtually unlimited input and the possibility of addressing individual problems
(Busa, 2008, p. 165; Neri, Cucchiarini, Strik, & Boves, 2002, p. 1), and
immediate feedback without the physical proximity of a teacher (Erben, Ban,
& Castaneda, 2009, p. 74).

Today’s smart phones are a sort of Swiss Army knife offering countless
possibilities, ranging from reading emails to tracking a run via GPS. So why not use
them to learn English pronunciation? I have focused on Apple’s iPhone because it is the
smart phone with the widest range of apps devised to teach pronunciation.

2. State of the app 

What makes smart phones so versatile is the number of ‘apps’ at their disposal which 
add new functions to the phone. However, there seems to be a shortage of apps dealing 
with English pronunciation. As Colpaert (2004) points out, in the history of CALL, 
hype has only been achieved when amateurs, not trained professionals, have been able 
to develop their own applications. 

Apps devoted to teaching pronunciation can be divided into two groups: those devised
for learning some aspect of pronunciation and those that function as reference tools.

2.1. Reference apps 

Some of these apps allow users to look up the pronunciation of a number of words
and sentences and hear them pronounced, such as Pronounce English AZ, Howjsay,
English as it is broken or FORVO; while others, like iPron, include a phonemic chart
with the symbols and their pronunciation. Some even allow users to record their own
pronunciation. Nevertheless, they do not incorporate any activities or practice, nor do 
users receive feedback on their performance. 

2.2. Pronunciation training apps 

These apps teach some aspects of English pronunciation and usually provide a range
of practice activities. The six apps analysed here pursue different goals. Besides
fostering sound discrimination, English File Pronunciation (EFP), Phonetic Focus (PF)
and Sounds teach the sounds of English with their phonemic symbols, possible
spellings and pronunciations, while Pronunciation Power and Enunciation focus on
articulation, and Clear Speech (CS) deals with discrimination of final sounds, word stress
and syllable awareness.

The first three apps introduce the symbols with interactive sound charts which
demonstrate their pronunciation in different positions (thereby also showing their possible
distributions) and, in EFP, in sentences as well. EFP has only two activities, one
for sound discrimination and another to check users’ knowledge of the symbols
(Figure 1). Like CS and Sounds, it keeps a record of users’ scores so that they can
concentrate on areas they may need to reinforce.

Figure 1. 



PF is the app with the widest variety of activities and presentation styles. It includes four 
tools to learn the sounds and eight activities to practise, such as sound discrimination 
exercises, tasks aimed at finding missing phonemes, reading transcriptions aloud, or 
spotting mistakes in phonemic transcriptions (Figure 2). However, the questions always 
appear in the same order. 


Figure 2. 



Sounds incorporates three types of activities (Figure 3): read (users read phonemic 
transcriptions and write their orthographic forms), write (users read words and 


Jonas Fouz Gonzalez 


transcribe them phonemically), and listen (users listen to words and transcribe them 
phonemically). It is the app that allows the most user control. Users can select the
model of English (British or American), the particular phonemes they want to practise,
the number of questions, and even whether they have three minutes or three lives to
complete the game. Moreover, it is the only app that offers the option of buying more
packages with extra words and sentences. 


Figure 3. 
Enunciation and Pronunciation Power have a different goal: they concentrate
on production and illustrate how to articulate English sounds through videos and
animations (Figure 4). They also include recordings of a range of words with
the sounds in different positions. Enunciation also contains the sounds in sentences
and allows users to record their voice. However, even though their aim is to help
learners pronounce the sounds, they do not incorporate any means by which users
can truly ‘practise’ what they produce, nor do they provide any feedback on users’
performance.

Pronunciation Power, while not targeting phonemes as such, does make use of
phonemic symbols. Enunciation, on the other hand, illustrates the pronunciation of /i:/
under the label of “long E”, or /e:/ as “A-2”, for instance, mixing orthographic spelling
with phonemic symbols (Figure 4, on the right). As Pennington (1994) recommends,
approaches that encourage equivalence through orthographic or simplified phonemic
representations of the L2 sounds should be avoided, since they invite interference with
L1 sounds. Reading “long E” will not mean the same to a Spanish speaker as to an
English speaker, for example.

Finally, Clear Speech is the only app which addresses suprasegmental features. It 
incorporates two sections devoted to practising sound discrimination of final sounds, 
one for word stress, and another one for syllable awareness (Figure 5). 
Figure 4.



As for the activities dealing with final sounds, ball toss is a sound discrimination game
in which users are presented with two minimal pairs below a pin and have to ‘aim
for the pin’ they hear; and stop or flow works on the distinction between continuing
and stopping sounds, illustrating this contrast with the metaphor of a tap which either 
closes with stopping sounds, or opens with continuing sounds (as articulators will when 
producing these). 

The two activities that address suprasegmental features are basketball and push
the blob. Basketball is devoted to helping users distinguish the number of syllables in
words and sentences. Users listen to words and sentences and have to ‘bounce’ a
ball once for every syllable they hear, which also helps them understand issues such
as vowel reduction, linking and other connected speech phenomena. In push the blob,
users have to recognise the stressed syllables and ‘push the blob through a hole that
matches the correct stress pattern’.

Figure 5. 



As for the model of English they adopt, apps like EFP or Sounds offer users the
possibility of choosing between British and American English, PF focuses on British
English and the rest on American English. With regard to the type of feedback
offered, it is usually a tick or a cross indicating whether the answer is correct or
not. The correct answer is shown and sometimes a sound is also played. In the case
of PF, some activities encourage users to read phonemic transcriptions aloud and
hear the correct pronunciation afterwards, thus offering a different type of correction;
however, none of the apps measures whether users actually ‘pronounce’ correctly.

One final issue that is paramount in this type of courseware is that the order of
questions should not be the same every time users access the app, since otherwise users
could memorise the correct answers. Although CS, EFP, and Sounds do change the order of
questions every time users enter the app, the correct response is always the same.
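
The point about question order can be illustrated with a minimal sketch in Python. It is not taken from any of the apps reviewed; the function name `shuffled_quiz` and the dictionary keys are hypothetical, and it assumes one possible reading of the problem, namely that both the order of the questions and the position of the correct option should vary between sessions.

```python
import random


def shuffled_quiz(questions, rng=None):
    """Return a new quiz with question order and option order randomised.

    `questions` is a list of dicts with the (hypothetical) keys
    "prompt", "options" and "answer". The input list is not modified.
    """
    if rng is None:
        rng = random.Random()
    quiz = []
    # random.sample returns a new list in random order.
    for question in rng.sample(questions, k=len(questions)):
        options = list(question["options"])
        rng.shuffle(options)  # vary where the correct option appears
        quiz.append({
            "prompt": question["prompt"],
            "options": options,
            # Store the position of the correct option after shuffling.
            "answer_index": options.index(question["answer"]),
        })
    return quiz


if __name__ == "__main__":
    items = [
        {"prompt": "Which word do you hear?",
         "options": ["ship", "sheep"], "answer": "sheep"},
        {"prompt": "Which word ends in a stop sound?",
         "options": ["bus", "but"], "answer": "but"},
    ]
    for item in shuffled_quiz(items):
        print(item["prompt"], item["options"], item["answer_index"])
```

Because the correct option is tracked by index after shuffling, users cannot rely on remembering its position from a previous session.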

3. Suggestions for future app development 

Despite the enormous potential that some of these apps show for helping users
‘understand’ English sounds and phonemes (a prerequisite and the first step towards
self-evaluation and autonomous learning), more attention should be devoted to
suprasegmental features and their functions. Apps could include dialogues illustrating
issues such as sentence stress or intonation, or video quizzes to test whether users can
identify a speaker’s attitude.
Furthermore, apps aimed at production should provide some type of feedback. Apps like
Dragon Dictation could be improved and exploited to this end. This app works with
speech recognition software which transcribes everything users say; thus, dialogues
could be created where users speak to their phones and receive written feedback. If the
machine understands them, the transcription will show what users meant to say; otherwise,
users should easily be able to spot what the problem was based on the transcription (e.g., Can
you pass me the Ben, please? instead of ‘pen’). Users should always know why they
have made the mistake and, if possible, be given suggestions for improvement (see
Levis, 2007; Neri et al., 2002). Siri, the iPhone’s virtual assistant, which also uses speech
recognition, could be similarly exploited for communicative practice.
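
The transcription-based feedback suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transcript is assumed to be supplied as plain text by some speech recogniser (no real recognition API is called here), and the helper names `tokenize` and `mismatches` are hypothetical. The sketch aligns the target sentence with the transcript word by word using the standard library's `difflib` and reports the words that differ.

```python
import difflib
import re


def tokenize(sentence):
    """Lower-case a sentence and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", sentence.lower())


def mismatches(target, transcript):
    """Return (expected, recognised) word pairs where the transcript differs.

    `target` is the sentence the user was asked to say; `transcript` is
    the text assumed to come back from a speech recogniser.
    """
    expected = tokenize(target)
    heard = tokenize(transcript)
    matcher = difflib.SequenceMatcher(a=expected, b=heard, autojunk=False)
    problems = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            problems.append((" ".join(expected[i1:i2]), " ".join(heard[j1:j2])))
    return problems


if __name__ == "__main__":
    print(mismatches("Can you pass me the pen, please?",
                     "Can you pass me the Ben, please?"))
    # → [('pen', 'ben')]
```

A mismatch such as ('pen', 'ben') only locates the problem; explaining why it occurred (here, possibly the voicing or aspiration of the initial consonant) and suggesting how to improve would still need to be added on top.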

Additionally, activities could make use of authentic materials in order to check that 
users really understand what they learn; for instance, they could incorporate a function 
by which users listened to podcasts and had to look for certain sounds or pronunciation 
features (elisions, assimilations, etc.). 

To conclude, simple explanations illustrating differences between the phonological
system of English and that of the users’ L1 might be useful, preferably reinforced with
sound discrimination practice. Many users will assume that an English /t/ will be the
same as a /t/ sound in their L1, or that intonation patterns convey the same information
in both languages, when this is not necessarily the case.